Skip to content

feat(cli): print session summary at end of exec runs#3278

Closed
Sayt-0 wants to merge 1 commit into
mainfrom
feat/exec-session-summary
Closed

feat(cli): print session summary at end of exec runs#3278
Sayt-0 wants to merge 1 commit into
mainfrom
feat/exec-session-summary

Conversation

@Sayt-0

@Sayt-0 Sayt-0 commented Jun 26, 2026

Copy link
Copy Markdown
Member

Summary

Adds an end-of-run session summary for non-interactive (--exec) runs. Until now --exec printed no usage recap, while the TUI only exposed the on-demand /cost dialog. The recap covers interaction stats (tool calls, success rate), per-model token usage, total cost, and cache savings, matching the request in #1345.

The data is derived from the existing session through a new Session.Stats() helper, so no new tracking pipeline or protocol events are introduced.

Sample output

─── Session summary ───
Session:       7532d5a9-92e9-4197-ae26-34855d7167f7
Duration:      1s
Requests:      2
Tool calls:    1 (1 ok, 0 failed)
Success rate:  100.0%

Model usage
MODEL           REQS   INPUT   CACHE READ   OUTPUT
openai/gpt-4o      2   1,600            0       20

Tokens:  1,600 in / 20 out
Cost:    $0.0042

Mapping to the request (#1345)

Requested field Status
Session ID done
Tool calls with success/failure split done
Success rate done
Per-model usage (requests, input, cache reads, output) done
Token totals and cache savings done
Wall-clock duration done
Agent-active / API-time / tool-time split deferred (needs runtime timing instrumentation)
User Agreement dropped (ambiguous metric)

Design notes

  • The recap is written to stderr, so stdout stays machine-consumable. The existing exec stdout contract (asserted by the e2e TestExec_* tests) is unchanged.
  • It is skipped in JSON output mode, where structured events are emitted instead.
  • It is a no-op when the session recorded no measurable activity.
  • No observability stack is added. General observability remains the role of OTEL, consistent with the feedback on the earlier closed PR runtime: add session-level observability plumbing #1436.

Validation

Check Result
task build pass
golangci-lint on pkg/session, pkg/cli 0 issues
Unit tests (pkg/session, pkg/cli) pass
e2e (TestExec_*) pass (stdout unchanged)
Manual real run (replay cassette, plain and tool-call) recap on stderr, stdout clean

Out of scope / possible follow-ups

  • API-time vs tool-time split (requires lightweight runtime timing).
  • A flag to suppress the recap (currently it can be silenced with 2>/dev/null).
  • Surfacing the same stats in JSON output mode.

Non-interactive runs previously emitted no usage recap; the TUI only had
the on-demand /cost dialog. This adds an end-of-run summary covering the
interaction (tool calls, success rate), per-model token usage, and total
cost.

The recap is derived from the existing session via a new Session.Stats()
helper (tool-call success/failure counts, per-model token and cost
breakdown, cache savings, duration). It is written to stderr so stdout
stays machine-consumable, and is skipped in JSON output mode.

Refs #1345
@Sayt-0 Sayt-0 requested a review from a team as a code owner June 26, 2026 17:04
@dgageot

dgageot commented Jun 26, 2026

Copy link
Copy Markdown
Member

@Sayt-0 add a --summary to enable this feature

@docker-agent docker-agent left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟡 NEEDS ATTENTION

One medium-severity issue found in the new Session.Stats() implementation. The overall PR structure is sound — stderr output, JSON mode guard, no-op for empty sessions, and correct token math are all well-implemented.

Comment thread pkg/session/stats.go
switch {
case item.IsMessage():
m := item.Message.Message
st.Cost += m.Cost

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] Potential cost double-counting in Stats() remote-mode fallback

st.Cost += m.Cost is accumulated unconditionally for every message during the walk (line 113), including assistant messages where m.Usage == nil. However, st.Requests is only incremented when m.Usage != nil. In remote mode, usage is stored in MessageUsageHistory rather than on the message object — so it is architecturally possible for an assistant message to carry m.Cost > 0 while m.Usage == nil.

When this happens:

  1. The walk adds the message's cost to st.Cost (via line 113).
  2. After the walk, st.Requests == 0 → the remote-mode fallback fires.
  3. The fallback also adds r.Cost for the same call from MessageUsageHistory.

The cost is counted twice. The fix is either to skip the message-level cost accumulation when m.Usage == nil (since that cost will be covered by the fallback path), or to zero out st.Cost before entering the fallback loop:

// Option A — only accumulate m.Cost when usage is present:
case chat.MessageRoleAssistant:
    if m.Usage != nil {
        st.Cost += m.Cost  // move cost inside the usage guard
        st.Requests++
        // ...
    }

// Option B — reset before fallback:
if st.Requests == 0 && len(s.MessageUsageHistory) > 0 {
    st.Cost = 0  // discard costs collected without usage data
    for _, r := range s.MessageUsageHistory { ... }
}

@Sayt-0 Sayt-0 closed this Jun 26, 2026
@Sayt-0 Sayt-0 deleted the feat/exec-session-summary branch June 26, 2026 17:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants